76 research outputs found
Scaling in Words on Twitter
Scaling properties of language are a useful tool for understanding generative
processes in texts. We investigate the scaling relations in citywise Twitter
corpora coming from the Metropolitan and Micropolitan Statistical Areas of the
United States. We observe a slightly superlinear urban scaling with the city
population for the total volume of the tweets and words created in a city. We
then find that a certain core vocabulary follows the same scaling relationship
as the bulk text, but most words are sensitive to city size, exhibiting
super- or sublinear urban scaling. For both regimes we can offer a plausible
explanation based on the meaning of the words. We also show that the parameters
of Zipf's law and Heaps' law on Twitter differ from those of other texts, and
that the exponent of Zipf's law changes with city size.
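As an illustration of what "superlinear urban scaling" means in practice, the following is a minimal sketch that estimates a scaling exponent beta in Y = c * N^beta by least squares on log-transformed data. The function name `fit_scaling_exponent`, the synthetic populations and the generating exponent 1.1 are assumptions made for this example only; they are not taken from the paper, and the authors' actual fitting procedure may differ.

```python
import numpy as np

def fit_scaling_exponent(population, volume):
    """Fit log(volume) = log(c) + beta * log(population) by least squares.

    Returns (beta, c). beta > 1 indicates superlinear scaling,
    beta < 1 sublinear scaling.
    """
    log_n = np.log(np.asarray(population, dtype=float))
    log_y = np.log(np.asarray(volume, dtype=float))
    # np.polyfit returns coefficients from highest degree down: slope, intercept
    beta, log_c = np.polyfit(log_n, log_y, 1)
    return beta, np.exp(log_c)

# Synthetic data for illustration only (not real city or Twitter data):
# 50 hypothetical cities with populations between 10^4 and 10^7.
rng = np.random.default_rng(0)
population = np.logspace(4, 7, 50)
noise = np.exp(rng.normal(0.0, 0.05, size=population.size))
volume = 0.5 * population ** 1.1 * noise  # generated with beta = 1.1

beta, c = fit_scaling_exponent(population, volume)
print(f"estimated exponent: {beta:.3f}")
```

Ordinary least squares on logarithms is the simplest choice; other estimators exist and can give somewhat different exponents on real data.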
Measuring the dimension of partially embedded networks
Scaling phenomena have been intensively studied during the past decade in the
context of complex networks. As part of this work, novel methods have recently
appeared to measure the dimension of abstract and spatially embedded
networks. In this paper we propose a new dimension measurement method for
networks, which does not require global knowledge of the embedding of the
nodes; instead, it exploits link-wise information (link lengths, link delays or
other physical quantities). Our method can be regarded as a generalization of
the spectral dimension, which captures the network's large-scale structure through
local observations made by a random walker while traversing the links. We apply
the presented method to synthetic and real-world networks, including road maps,
the Internet infrastructure and the Gowalla geosocial network. We analyze the
theoretically and empirically motivated case in which the length distribution of
the links has the form P(r) ~ 1/r. We show that while previous dimension
concepts are not applicable in this case, the new dimension measure still
exhibits scaling with two distinct scaling regimes. Our observations suggest
that the link length distribution is not sufficient in itself to entirely
control the dimensionality of complex networks, and we show that the proposed
measure provides information that complements other known measures.
Do the rich get richer? An empirical analysis of the BitCoin transaction network
The possibility to analyze everyday monetary transactions is limited by the
scarcity of available data, as this kind of information is usually considered
highly sensitive. Present econophysics models are usually employed on presumed
random networks of interacting agents, and only macroscopic properties (e.g.
the resulting wealth distribution) are compared to real-world data. In this
paper, we analyze BitCoin, which is a novel digital currency system, where the
complete list of transactions is publicly available. Using this dataset, we
reconstruct the network of transactions, and extract the time and amount of
each payment. We analyze the structure of the transaction network by measuring
network characteristics over time, such as the degree distribution, degree
correlations and clustering. We find that linear preferential attachment drives
the growth of the network. We also study the dynamics taking place on the
transaction network, i.e. the flow of money. We measure temporal patterns and
the wealth accumulation. Investigating the microscopic statistics of money
movement, we find that sublinear preferential attachment governs the evolution
of the wealth distribution. We report a scaling relation between the degree and
wealth associated with individual nodes.
Comment: Project website: http://www.vo.elte.hu/bitcoin/; updated after
publicatio
A Bayesian Approach to Identify Bitcoin Users
Bitcoin is a digital currency and electronic payment system operating over a
peer-to-peer network on the Internet. One of its most important properties is
the high level of anonymity it provides for its users. The users are identified
by their Bitcoin addresses, which are random strings in the public records of
transactions, the blockchain. When a user initiates a Bitcoin-transaction, his
Bitcoin client program relays messages to other clients through the Bitcoin
network. Monitoring the propagation of these messages and analyzing them
carefully reveal hidden relations. In this paper, we develop a mathematical
model using a probabilistic approach to link Bitcoin addresses and transactions
to the originator IP address. To utilize our model, we carried out experiments
by installing more than a hundred modified Bitcoin clients distributed in the
network to observe as many messages as possible. During a two-month observation
period we were able to identify several thousand Bitcoin clients and bind their
transactions to geographical locations.
Challenges and experiences of a participative green space development in Budapest-Józsefváros
This article presents the theoretical and practical background of a participative green space development in Hungary. The renewed green space, Mátyás square, is located in District VIII of Budapest, known as Józsefváros. The neighbourhood of Mátyás square had a very negative image: neglected residential areas suffering from various social problems extended into the heart of the district. The local government of Józsefváros elaborated the so-called Magdolna Quarter Programme, which contains the details of the social rehabilitation of the surroundings of Mátyás square. Within the framework of this programme, co-financed by the EU through the GreenKeys Project, the square was renewed through a collaborative and participative green space development. The authors were engaged in this model programme and attempt to summarize briefly the experiences of this unique Budapest project. Local residents were successfully involved in the planning and implementation of the project. Participation was considerably efficient; however, experience shows that a participative project may be shorter than the project leaders thought. As a result of these activities, the Urban Green Space Strategy of Józsefváros and a computer program for monitoring green spaces were also compiled.
Inferring the interplay of network structure and market effects in Bitcoin
A main focus in economics research is understanding the time series of prices
of goods and assets. While statistical models using only the properties of the
time series itself have been successful in many aspects, we expect to gain a
better understanding of the phenomena involved if we can model the underlying
system of interacting agents. In this article, we consider the history of
Bitcoin, a novel digital currency system, for which the complete list of
transactions is available for analysis. Using this dataset, we reconstruct the
transaction network between users and analyze changes in the structure of the
subgraph induced by the most active users. Our approach is based on the
unsupervised identification of important features of the time variation of the
network. Applying the widely used method of Principal Component Analysis to the
matrix constructed from snapshots of the network at different times, we are
able to show how structural changes in the network accompany significant
changes in the exchange price of bitcoins.
Comment: project website: http://www.vo.elte.hu/bitcoi
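The idea of applying Principal Component Analysis to a matrix built from network snapshots can be sketched as follows. This is a minimal illustration under stated assumptions: each snapshot is taken to be a dense adjacency (or weight) matrix that is flattened into one row, and PCA is computed through a singular value decomposition of the centered matrix. The function name `snapshot_pca` and the synthetic snapshots are hypothetical; the abstract does not specify how the authors construct or normalize their matrix.

```python
import numpy as np

def snapshot_pca(snapshots, n_components=2):
    """PCA over a time series of network snapshots.

    snapshots: sequence of equally shaped 2-D arrays (one per time step).
    Returns (scores, components, explained_ratio) for the leading components.
    """
    # One row per time step, one column per node pair
    X = np.stack([np.asarray(s, dtype=float).ravel() for s in snapshots])
    # Center each column so that the SVD yields principal components
    X = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return scores, Vt[:n_components], explained_ratio[:n_components]

# Synthetic snapshots for illustration only: a fixed base structure plus
# one pattern whose weight grows linearly in time, plus small noise.
rng = np.random.default_rng(1)
base = rng.random((5, 5))
pattern = rng.random((5, 5))
snapshots = [base + t * pattern + 0.01 * rng.normal(size=(5, 5))
             for t in range(10)]

scores, components, explained = snapshot_pca(snapshots)
print("variance explained by first component:", round(float(explained[0]), 4))
```

In this toy setting the time course of the first component tracks the growing pattern; comparing such time courses with an external series, such as a price, is the kind of analysis the abstract describes.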
Race, Religion and the City: Twitter Word Frequency Patterns Reveal Dominant Demographic Dimensions in the United States
Recently, numerous approaches have emerged in the social sciences to exploit
the opportunities made possible by the vast amounts of data generated by online
social networks (OSNs). Having access to information about users on such a
scale opens up a range of possibilities, all without the limitations associated
with often slow and expensive paper-based polls. A question that remains to be
satisfactorily addressed, however, is how demography is represented in OSN
content. Here, we study language use in the US using a corpus of text compiled
from over half a billion geo-tagged messages from the online microblogging
platform Twitter. Our intention is to reveal the most important spatial
patterns in language use in an unsupervised manner and relate them to
demographics. Our approach is based on Latent Semantic Analysis (LSA) augmented
with the Robust Principal Component Analysis (RPCA) methodology. We find
spatially correlated patterns that can be interpreted based on the words
associated with them. The main language features can be related to slang use,
urbanization, travel, religion and ethnicity, the patterns of which are shown
to correlate plausibly with traditional census data. Our findings thus validate
the concept of demography being represented in OSN language use and show that
the traits observed are inherently present in the word frequencies without any
previous assumptions about the dataset. Thus, they could form the basis of
further research focusing on the evaluation of demographic data estimation from
other big data sources, or on the dynamical processes that result in the
patterns found here.